feat: trust console + memory check layer (v0.15.0) by codejunkie99 · Pull Request #40 · codejunkie99/agentic-stack

codejunkie99 · 2026-05-05T18:23:48Z

Summary

New agentic-stack doctor, tui, memory ..., verify, team ... commands inspect the file-backed .agent/ data layer with no daemon. Same normalized data model powers text, JSON, and a read-only stdlib curses TUI.
Status glyphs (✓ / ! / ✗) replace PASS/WARN/FAIL in the TUI; encoding-aware fallback to + / ! / x under non-UTF-8 stdout.
Three codex-review-surfaced fixes: adapter conformance now requires every listed file, doctor --json preserves non-zero exit on failure, glyph fallback for ASCII terminals.
27/27 standalone regression checks (verify_trust_console.py).
Branch was 72 commits behind master; this PR includes the integration merge, with conflict resolutions documented in commit bddc63b.

Test plan

python3 verify_trust_console.py — 27/27 pass
python3 agentic_stack_cli.py doctor — exit 0
python3 agentic_stack_cli.py tui --plain — renders glyphs
python3 agentic_stack_cli.py verify --all --json — exit 0
Fresh install claude-code <tmp> --yes + verify claude-code — all six conformance dimensions pass
CI on the branch
Brew formula bump after v0.15.0 tag (follow-up commit)

Baseline commit so subsequent per-bug fixes have minimal diffs. No behavior changes; just brings these files under version control:
- .agent/harness/runtime.py
- .agent/harness/control_plane.py
- .agent/harness/lesson_store.py
- .agent/tools/instances.py

Previously, the second positional arg was assigned to TARGET unconditionally, so the documented form `agentic-stack claude-code --yes` wrote into a literal `--yes/.agent` directory. Now flags are filtered out of positional parsing, TARGET defaults to $PWD when only an adapter and flags are passed, and unknown -flags are rejected rather than silently consumed. Refs: HIGH_PRIORITY_BUG_REPORT.md (P0)

`_parse_args()` previously treated every arg starting with `-` as a flag, silently dropping target paths that begin with `-` and falling back to `os.getcwd()`. Now `--` ends flag parsing, only known flags (--yes/-y/--force/--reconfigure) are consumed as flags, and unrecognized -tokens warn-and-treat-as-path instead of being eaten. Refs: HIGH_PRIORITY_BUG_REPORT.md (P0)
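A minimal sketch of the parsing rule described above, not the actual `_parse_args()` body. The function name `parse_args`, its return shape, and the warning text are assumptions; only the flag set and the three rules (`--` ends flag parsing, known flags are consumed, unknown `-tokens` warn and fall through as paths) come from the commit message.

```python
import sys

# Flag names taken from the commit message; everything else is illustrative.
KNOWN_FLAGS = {"--yes", "-y", "--force", "--reconfigure"}


def parse_args(argv):
    """Split argv into (flags, positionals) without eating dash-prefixed paths."""
    flags, positionals = set(), []
    flags_done = False
    for arg in argv:
        if flags_done:
            # Everything after a bare `--` is positional, even `--yes`.
            positionals.append(arg)
        elif arg == "--":
            flags_done = True
        elif arg in KNOWN_FLAGS:
            flags.add(arg)
        elif arg.startswith("-"):
            # Unknown dash token: warn, then keep it as a path instead of dropping it.
            print(f"warning: unrecognized option {arg!r}; treating it as a path",
                  file=sys.stderr)
            positionals.append(arg)
        else:
            positionals.append(arg)
    return flags, positionals
```

With this shape, a target directory literally named `--yes` can still be reached by writing `claude-code -- --yes`.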

`mark_worker_stopped()` previously left `active_instance` pointing at a stopped instance, so workers that exited via STOP/SIGINT/SIGTERM kept the registry routing future work to a dead instance. Now matches the CLI `stop_instance()` behavior: clears `active_instance` if it points at the instance being stopped, then persists. Refs: HIGH_PRIORITY_BUG_REPORT.md (P0)
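The fix can be illustrated with an in-memory registry. The dict layout (`instances`, `status`, `active_instance` as keys), the function name `mark_stopped`, and the `persist` callback are all assumptions for illustration; the real registry format is not shown in this PR text.

```python
def mark_stopped(registry, instance_id, persist=lambda reg: None):
    """Mark an instance stopped and stop routing work to it (hypothetical schema)."""
    instance = registry["instances"].get(instance_id)
    if instance is None:
        return registry
    instance["status"] = "stopped"
    # Only clear the pointer if it refers to the instance being stopped,
    # so stopping a background worker never deactivates a different one.
    if registry.get("active_instance") == instance_id:
        registry["active_instance"] = None
    persist(registry)
    return registry
```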

Hook only checked `blocked_targets` and the `requires_approval` boolean, ignoring `blocked_patterns` and `requires_approval_patterns` from the shell schema. Now matches command strings against both pattern lists via re.search, blocks bad regex with stderr warnings (fail-soft), and runs before the legacy boolean and permissions.md keyword heuristics. Catches: `curl ... | sh`, `rm -rf /`, `git push --force`, etc. Refs: HIGH_PRIORITY_BUG_REPORT.md (Critical)
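A rough sketch of the pattern pass, assuming both lists hold plain regex strings. The function name `match_policy`, the `"block"` / `"approve"` / `"allow"` return values, and the choice to skip an invalid pattern after warning are assumptions; the commit message only says bad regexes produce stderr warnings and are handled fail-soft.

```python
import re
import sys


def match_policy(command, blocked_patterns, approval_patterns):
    """Return 'block', 'approve', or 'allow' for a shell command string."""

    def first_match(patterns):
        for pattern in patterns:
            try:
                if re.search(pattern, command):
                    return pattern
            except re.error as exc:
                # Assumed fail-soft behavior: warn and move on to the next pattern.
                print(f"warning: skipping invalid pattern {pattern!r}: {exc}",
                      file=sys.stderr)
        return None

    # Blocked patterns win over approval patterns.
    if first_match(blocked_patterns) is not None:
        return "block"
    if first_match(approval_patterns) is not None:
        return "approve"
    return "allow"
```

The example patterns in the test below are illustrative, not the ones shipped in the shell schema.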

No CI previously ran the documented verifier scripts, so high-risk areas could regress without merge-time signal. Workflow runs on push and PR to master with three jobs:
- verifiers (ubuntu): test_claude_code_hook.py, verify_codex_fixes.py, verify_instances.py
- installer-smoke (ubuntu): exercises both `install.sh claude-code <path> --yes` and the documented no-path form `install.sh claude-code --yes`, asserting no literal `--yes/` directory is created
- installer-windows-pwsh (windows): pwsh install.ps1 parity
Refs: HIGH_PRIORITY_BUG_REPORT.md (P0)

Windows installer omitted the documented `pi` adapter from the usage comment, ValidAdapters list, and switch cases. Now mirrors install.sh: creates `<TARGET>/.pi/AGENTS.md` only if absent, then wires `.pi/skills` to `.agent/skills` via SymbolicLink, falling back to Junction, then a recursive copy. Safer than install.sh: an existing real `.pi/skills` directory is renamed to a timestamped `.bak-` rather than rm-rf'd. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)

Existing Homebrew test only used the explicit-path form `claude-code <path> --yes`, so it never exercised the broken documented `claude-code --yes` ordering. Pre-creating `testpath/.agent/memory/personal` also masked the install.sh skip-when-exists branch in the .agent copy. Now: removed the pre-creation, asserted `runtime.py` exists after the explicit-path install (full tree copy), then ran `claude-code --yes` inside a fresh subdir asserting no `--yes/` directory was created. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)

Manifest-provided names and precondition paths were joined under SKILLS_DIR / ROOT without containment checks, so a poisoned manifest entry with `../` could probe files outside the skill tree. Adds `_within(root, candidate)` resolve-and-relative-to check, regex validation for skill names, and per-file containment checks before opening SKILL.md / KNOWLEDGE.md. Bad entries warn to stderr and skip rather than crash the loader. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
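One plausible shape for a resolve-and-relative-to containment check, using only `pathlib`. This is a sketch under the assumption that the helper returns a boolean; the name `within` is illustrative and the real `_within(root, candidate)` may differ.

```python
from pathlib import Path


def within(root, candidate):
    """True if candidate, after resolving `..` and symlinks, stays under root."""
    root_resolved = Path(root).resolve()
    try:
        # relative_to compares whole path components, so a sibling such as
        # "<root>-evil" is not mistaken for a child of "<root>".
        Path(candidate).resolve().relative_to(root_resolved)
    except ValueError:
        return False
    return True
```

A loader would call this before opening each manifest-derived path and skip the entry with a warning when it returns `False`.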

Concurrent `start` calls could both observe no live worker and spawn duplicate workers for the same queue (check then subprocess then mark-started left a TOCTOU window). Adds an fcntl exclusive non-blocking lock on `<runtime>/spawn.lock` held across re-check-spawn-mark, so a contended caller bails fast with "another spawn in flight". Liveness now also checks via `os.kill(pid, 0)` so a stale-but-non-None pid triggers respawn. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
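The lock-then-recheck sequence might look roughly like this on POSIX. The helper names (`pid_alive`, `spawn_once`), the callback parameters, and the use of `fcntl.flock` rather than `fcntl.lockf` are assumptions; the commit message only specifies an exclusive non-blocking fcntl lock on `spawn.lock` and the `os.kill(pid, 0)` liveness probe.

```python
import fcntl
import os


def pid_alive(pid):
    """Probe a pid with signal 0; a None or non-positive pid counts as dead."""
    if not pid or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def spawn_once(lock_path, read_pid, spawn):
    """Re-check and spawn under an exclusive lock; returns (pid, message)."""
    with open(lock_path, "a") as lock_file:
        try:
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            # Someone else holds the lock: bail fast instead of waiting.
            return None, "another spawn in flight"
        try:
            pid = read_pid()
            if pid_alive(pid):
                return pid, "already running"
            # In the real flow, marking the worker started would also happen here.
            return spawn(), "spawned"
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Holding the lock across the re-check, the spawn, and the mark is what closes the check-then-act window; the liveness probe alone would not.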

Required and optional sections were previously appended unconditionally, with only matched-skills gated by the budget, so an oversized WORKSPACE.md or lessons file would blow past `budget` regardless. Now every append checks `_room()` first. Required sections (role, permissions, paths) are truncated with a marker rather than dropped; optional sections (lessons, episodes, skills) are skipped with an "[N items omitted]" marker. A per-required-section header floor is reserved so an early section cannot starve later ones. Returns a `_UsedTokens` int subclass exposing `.overflow` while preserving the `(ctx, used)` 2-tuple shape for existing callers. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
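A simplified sketch of the budget rule and the int-subclass return value. Everything here is illustrative: tokens are approximated as whitespace-separated words, the `[truncated]` marker text and the names `UsedTokens` and `build_context` are assumptions, and the per-section header floor is left out for brevity.

```python
TRUNCATED = "[truncated]"  # hypothetical marker text


class UsedTokens(int):
    """An int that also records how far past the budget the context went."""

    def __new__(cls, used, overflow=0):
        obj = super().__new__(cls, used)
        obj.overflow = overflow
        return obj


def build_context(required, optional, budget):
    """Return (ctx, used): required text is truncated, optional text is skipped."""
    parts, used, omitted = [], 0, 0
    for text in required:
        words = text.split()
        room = budget - used
        if len(words) > room:
            # Never drop a required section; keep what fits plus a marker.
            words = words[:max(room - 1, 0)] + [TRUNCATED]
        parts.append(" ".join(words))
        used += len(words)
    for text in optional:
        cost = len(text.split())
        if used + cost <= budget:
            parts.append(text)
            used += cost
        else:
            omitted += 1
    if omitted:
        parts.append(f"[{omitted} items omitted]")
    return "\n".join(parts), UsedTokens(used, max(used - budget, 0))
```

Because `UsedTokens` is an `int`, callers that unpack `ctx, used` and do arithmetic on `used` keep working, while newer callers can read `used.overflow`.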

`mark_graduated` / `mark_rejected` / `mark_reopened` joined raw `candidate_id` into paths without sanitization, so an id with `../` could resolve outside the candidates directory. Adds module-level `_validate_candidate_id` (regex `^[a-zA-Z0-9_-]{1,128}$`) called at the top of each lifecycle entry point, plus `_ensure_within` realpath-containment defense-in-depth against symlink shenanigans. Non-atomic write fix is a separate commit. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
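The two layers described above could be combined as follows. The regex is the one quoted in the commit message; the function names, the `.json` suffix, and the `ValueError` signalling are assumptions. The sketch uses `fullmatch` because `$` in Python also matches just before a trailing newline.

```python
import os
import re

# Pattern quoted from the commit message.
_CANDIDATE_ID_RE = re.compile(r"^[a-zA-Z0-9_-]{1,128}$")


def validate_candidate_id(candidate_id):
    """Reject ids containing separators, dots, or anything outside the allowlist."""
    if not isinstance(candidate_id, str) or not _CANDIDATE_ID_RE.fullmatch(candidate_id):
        raise ValueError(f"invalid candidate id: {candidate_id!r}")
    return candidate_id


def ensure_within(base_dir, path):
    """Defense in depth: resolve symlinks and confirm the path stays under base_dir."""
    base = os.path.realpath(base_dir)
    real = os.path.realpath(path)
    if os.path.commonpath([base, real]) != base:
        raise ValueError(f"path escapes {base_dir!r}: {path!r}")
    return real


def candidate_path(candidates_dir, candidate_id):
    """Build the on-disk path for a candidate (the .json suffix is assumed)."""
    validate_candidate_id(candidate_id)
    return ensure_within(candidates_dir, os.path.join(candidates_dir, candidate_id + ".json"))
```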

`graduate.py` joined raw `candidate_id` into paths from sys.argv, so a caller could probe candidate-shaped JSON outside the candidate dir. Validates candidate_id at the CLI entry point right after parse_args (rejects with exit code 4) and adds a `_safe_candidate_path` helper that re-validates plus realpath-checks containment under CANDIDATES_DIR. Imports `_validate_candidate_id` from review_state when available, falls back to a local copy with the same regex. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)

Duplicate-detection callers in graduate.py and auto_dream.py read LESSONS.md, which is rendered accepted-only — so a provisional lesson could be re-staged or re-graduated as if novel. Adds `render_dedup_text()` and `_load_all_for_dedup()` that include every lesson regardless of status (annotated with the real status), and points the two prefilter call sites at the new function. The accepted-only `render_visible_lessons_md` is unchanged so agent context keeps the same trust boundary. Refs: HIGH_PRIORITY_BUG_REPORT.md (High)

`_write_entries()` did a direct truncate-and-rewrite on AGENT_LEARNINGS.jsonl, so a crash, disk-full, or concurrent hook append during run_dream_cycle could lose the entire log. Now snapshots prior state to `.bak`, writes to `.tmp`, fsyncs, then `os.replace`s atomically. Cleans up `.tmp` on failure with original file intact. `_load_entries(report_malformed=True)` surfaces bad-line counts via stderr from `run_dream_cycle` so corruption isn't silent. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)

`stage()` deterministically computed the candidate id and wrote a fresh record with `rejection_count: 0` and an empty decisions list — so re-teaching a previously-rejected candidate erased its rejection history and made churn look novel. Now `_find_prior` checks candidates/, candidates/rejected/, and candidates/graduated/. If a non-provisional graduated record exists, re-staging refuses with exit 3. Otherwise the new record preserves `rejection_count`, `staged_at`, and the prior decisions list, appending a fresh `staged` or `re-staged` entry. The old rejected copy is removed once the new staged file lands so the candidate lives in exactly one location. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)

Auto-promoted candidates were written via direct `open(path, "w")`, so an interruption mid-write left a partial file that the listing loop silently skipped. Adds `_atomic_write_json()` helper using `open(path+".tmp","w")` -> flush -> fsync -> `os.replace(tmp, path)`, with a try/except cleanup of the temp file on failure. The single existing JSON write at line 188 now goes through it. Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
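The write sequence named above (temp file, flush, fsync, `os.replace`, cleanup on failure) can be sketched like this. The function name and error handling details are illustrative, not the actual `_atomic_write_json()`.

```python
import json
import os


def atomic_write_json(path, data):
    """Write JSON so readers see either the old file or the complete new one."""
    tmp = path + ".tmp"
    try:
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())  # push the bytes to disk before the rename
        os.replace(tmp, path)      # atomic when tmp and path share a filesystem
    except BaseException:
        # Leave the original untouched and remove the partial temp file.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```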

`render_lessons()` acquired the lock via `_locked_jsonl()` then called `load_lessons()`, which opened the same path on a separate UNLOCKED descriptor — so concurrent appends could produce torn reads despite the comment claiming the read-render-write cycle was locked. `load_lessons()` now accepts an optional keyword-only `fp=` argument that reads through a caller-provided locked descriptor. `render_lessons()` binds the locked fp from `_locked_jsonl()` and passes it in. Existing positional callers are unaffected. Refs: HIGH_PRIORITY_BUG_REPORT.md (High)
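A small sketch of the keyword-only `fp=` idea, assuming the lessons file is JSON Lines (suggested by the `_locked_jsonl()` name, but not confirmed here). The parsing helper and the rewind via `seek(0)` are assumptions.

```python
import json


def _parse_jsonl(fh):
    """Parse one JSON object per non-blank line."""
    return [json.loads(line) for line in fh if line.strip()]


def load_lessons(path, *, fp=None):
    """Read lessons from path, or from an already-open (locked) descriptor."""
    if fp is not None:
        # Reuse the caller's descriptor so the read happens under its lock.
        fp.seek(0)
        return _parse_jsonl(fp)
    with open(path, "r", encoding="utf-8") as fh:
        return _parse_jsonl(fh)
```

Making `fp` keyword-only means `load_lessons(path)` callers are untouched and nobody can pass a descriptor positionally by accident.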

`post_execution.py` and `on_failure.py` both did raw `open(...,"a").write(json.dumps(...))` into AGENT_LEARNINGS.jsonl, so parallel hook invocations could interleave writes. The dream-cycle rewrite path also raced. Adds `append_episodic_entry()` to `_provenance.py` that takes `fcntl.flock(LOCK_EX)` on the open fd, writes the JSON line, flushes + fsyncs, releases on context exit. Both hooks now go through it. Documents the residual locking-model gap with auto_dream.py's atomic rename rewrite (different mechanisms; worst case is a single lost entry written between snapshot read and rename — acceptable). Refs: HIGH_PRIORITY_BUG_REPORT.md (P1)
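The locked append could look roughly like the following on POSIX. The function name `append_jsonl_locked` is illustrative; the sequence (exclusive `flock` on the open descriptor, write, flush, fsync, release) follows the commit message.

```python
import fcntl
import json
import os


def append_jsonl_locked(path, entry):
    """Append one JSON line while holding an exclusive flock on the file."""
    line = json.dumps(entry) + "\n"
    with open(path, "a", encoding="utf-8") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until other writers release
        try:
            fh.write(line)
            fh.flush()
            os.fsync(fh.fileno())
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

The lock is advisory, so it only protects against writers that also take it; that is the residual gap the commit message notes for the rename-based rewrite path.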

`claim_next_job()` removed jobs whose JSON failed to parse, silently losing partial writes or manually corrupted entries with no diagnostic artifact. Now moves the file from `running/` to `failed/<job>.json` via `os.replace` (atomic) and writes a `<job>.json.error.json` sidecar containing the parse error, UTC ISO timestamp, and the original queued/ path. Stderr warning emitted so callers/operators can find the quarantine. Refs: HIGH_PRIORITY_BUG_REPORT.md (P2)
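A sketch of the quarantine step under assumed names. The sidecar key names (`error`, `quarantined_at`, `original_path`) and the function `load_job_or_quarantine` are hypothetical; the file layout (`failed/<job>.json` plus `<job>.json.error.json`) follows the commit message.

```python
import json
import os
import sys
from datetime import datetime, timezone


def load_job_or_quarantine(running_path, failed_dir, queued_path):
    """Return the parsed job, or move an unparseable file to failed/ and return None."""
    try:
        with open(running_path, "r", encoding="utf-8") as fh:
            return json.load(fh)
    except json.JSONDecodeError as exc:
        os.makedirs(failed_dir, exist_ok=True)
        dest = os.path.join(failed_dir, os.path.basename(running_path))
        os.replace(running_path, dest)  # atomic on the same filesystem
        sidecar = dest + ".error.json"
        with open(sidecar, "w", encoding="utf-8") as fh:
            json.dump(
                {
                    "error": str(exc),
                    "quarantined_at": datetime.now(timezone.utc).isoformat(),
                    "original_path": queued_path,
                },
                fh,
            )
        print(f"warning: quarantined corrupt job at {dest}", file=sys.stderr)
        return None
```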

- adapter installed: require all listed files (was passing if any existed; caused false positives in `verify` for opencode/pi/etc.)
- doctor --json: preserve non-zero exit code on failed checks (JSON path was always returning 0, masking failures in CI)
- tui glyphs: swap PASS/WARN/FAIL text labels for ✓/!/✗ glyphs in curses + plain modes; encoding-aware fallback to +/!/x on non-UTF-8 terminals (PYTHONIOENCODING=ascii, LANG=C)
- gitignore: exclude `.agent/memory/**/*.bak` runtime backups
- tests: 6 new regression checks (27/27 passing) covering opencode partial install, hermes single-file, doctor --json broken-project exit code, and glyph fallback across encodings
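One way to implement the glyph fallback is to test whether the target encoding can represent the glyphs at all, rather than comparing encoding names. This is an assumed approach, not necessarily what the TUI does; the dict names and `glyphs_for` are illustrative, while the glyph pairs come from the PR text.

```python
import sys

UNICODE_GLYPHS = {"pass": "✓", "warn": "!", "fail": "✗"}
ASCII_GLYPHS = {"pass": "+", "warn": "!", "fail": "x"}


def glyphs_for(encoding):
    """Pick the Unicode glyph set only if the encoding can actually render it."""
    try:
        "✓✗".encode(encoding or "ascii")
    except (UnicodeEncodeError, LookupError):
        # Unencodable glyphs or an unknown codec name: fall back to ASCII.
        return ASCII_GLYPHS
    return UNICODE_GLYPHS


def status_line(status, label, stream=sys.stdout):
    """Format a check result for the given stream, e.g. '✓ adapter installed'."""
    glyph = glyphs_for(getattr(stream, "encoding", None))[status]
    return f"{glyph} {label}"
```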

Integrates 72 commits from master (v0.13.0..v0.14.0 + post-tag work) into the trust console branch, then resolves 11 file conflicts.
Resolutions:
- install.sh / install.ps1: took master (rewrote to thin Python dispatcher; feature's bash flag-parsing fixes are obsoleted).
- Formula/agentic-stack.rb: combined master's harness_manager + scripts + transfer test with feature's agentic_stack_cli.py + runtime.py + no-path test. Wrapper still delegates to install.sh; trust console CLI is installed alongside but not the bin entrypoint (follow-up: integrate trust commands into harness_manager.cli).
- README.md, CHANGELOG.md: combined entries from both sides.
- .gitignore: combined; .bak exclusion added under master's structure.
- .agent/tools/learn.py: kept feature's prior-record merge logic, adopted master's UTC timestamp.
- .agent/tools/skill_loader.py: kept both feature's _SAFE_NAME_RE containment check and master's skill_enabled() guard.
- .agent/harness/hooks/on_failure.py, post_execution.py: took master (uses _episodic_io.append_jsonl; feature's _provenance.append_episodic_entry is now redundant).
- .agent/memory/auto_dream.py: took master (flock-based atomic writes supersede feature's tempfile+.bak approach).
Verified post-merge: 27/27 regression checks pass; doctor, tui --plain, verify all exit 0.
Known follow-ups (defer to post-merge):
- Formula version/sha bump for v0.15.0 release tag (P1 from codex review).
- Wire trust console commands into harness_manager.cli or update bin wrapper so `agentic-stack doctor` resolves to the trust console CLI.

codejunkie99 and others added 24 commits April 25, 2026 15:22

docs: add trust console tui design

4e3ad62

feat: add trust console tui

a5daba1

codejunkie99 merged commit cebe245 into master May 5, 2026
0 of 3 checks passed

codejunkie99 merged 24 commits into master from feature/trust-console-tui